Abstract: Network-layer capabilities offer strong protection
against  link  flooding  by  authorizing  individual  flows  with  un- 
forgeable  credentials  (i.e.,  capabilities).  However,  the  capability- 
setup  channel  is  vulnerable  to  flooding  attacks  that  prevent 
legitimate  clients  from  acquiring  capabilities;  i.e.,  in  Denial  of 
Capability  (DoC)  attacks.  Based  on  the  observation  that  the 
distribution  of  attack  sources  in  the  current  Internet  is  highly 
non-uniform,  we  provide  a  router-level  scheme,  named  DefAT 
(Defense  via  Aggregating  Traffic),  that  confines  the  effects  of 
DoC  attacks  to  specified  locales  or  neighborhoods  (e.g.,  one  or 
more  administrative  domains  of  the  Internet).  DefAT  provides 
precise  access  guarantees  for  capability  schemes,  even  in  the 
face  of  flooding  attacks.  The  effectiveness  of  DefAT  is  shown 
in two ways. First, we illustrate the precise link-access guarantees
provided by DefAT via ns2 simulations. Second, we show the
effectiveness of DefAT in the current Internet via Internet-scale
simulations using real Internet topologies and attack distributions.


I.  Introduction 

Current service-flooding attacks rely on a large number
of compromised machines that are organized as a “bot”
network. Typical defense mechanisms that attempt to provide
service-access guarantees despite such attacks assume the absence
of flooding in the underlying network links. Yet, a large-scale
attack (e.g., a “botnet” with millions of “bots”) can
flood any chosen link in the Internet. In particular, defense
mechanisms deployed at links near or at a network edge (e.g.,
firewalls, IDSs) can be easily overwhelmed by such attacks.
Worse yet, legitimate-looking attack packets can evade most
traditional techniques for handling address-spoofing attacks at
the network layer (e.g., IP traceback [1], [2], ingress filtering
[3]).

Capability-based  solutions,  whereby  distinct  packet  flows 
are  separately  authorized  through  capabilities  obtained  before 
flow  initiation  [4]-[6],  provide  congested  routers  with  an 
effective  way  to  prioritize  legitimate  flows  and  filter  out 
unwanted  traffic.  Though  promising,  these  solutions  are  still 
vulnerable  to  flooding  attacks  targeting  the  capability-setup 
channel,  known  as  the  Denial  of  Capability  (DoC)  attacks 
[7].  These  attacks  are  possible  because  the  initial  capability- 
request  packets  are  treated  as  best-effort  packets,  as  opposed 
to  the  subsequent  high-priority  packets  that  carry  capabilities. 
If  DoC  attacks  cannot  be  countered,  flow  authorization  via 
network-layer  capabilities  becomes  impossible,  and  all  access 
guarantees  become  meaningless  at  congested  routers. 

Previous solutions that attempt to protect capability requests
from flooding attacks (e.g., mechanisms based on aggregate
request rates [6] or on proof of work [8]), though useful, are
insufficient to provide dependable link-access guarantees for
legitimate capability requests. For example, a fair-queueing
mechanism, which fairly allocates buffer space to flow aggregates
based on a router’s confidence in precise identification
of traffic origin [6], fails to provide any guarantee of link
access (viz., Section VII-A). Mechanisms based on proof of
work (e.g., Portcullis [8]) provide only weak access guarantees
during flooding attacks, as they are (at best linearly) dependent
on the number of global attack sources; e.g., a large number
of bots could still flood a chosen link despite such guarantees.
These previous schemes achieve relatively weak guarantees
because they assume that attack sources are uniformly distributed
in the network.

We observe, however, that malicious hosts, or bots, are
clustered: some domains include sufficiently strong security
mechanisms that enable them to counter or deter contamination;
others are easily contaminated by bots. Non-uniform
distribution of attack sources is evident in a variety of worm
propagation models [9], [10], evolutionary features of previous
worms such as CodeRed I/II, Nimda and Slammer,
and the distribution of spam-bots [11] (viz., Section VIII-A).
This  non-uniformity  actually  enables  us  to  achieve  stronger 
guarantees.  To  be  meaningful,  these  guarantees  have  to  be 
independent  of  the  number  of  attack  sources  (i.e.,  the  size  of 
a  global  botnet).  In  the  worst  case,  they  can  only  depend  on 
attack  sources  in  defined  locales  or  neighborhoods  (e.g.,  an 
administrative  domain  or  a  set  of  domains  in  the  Internet). 
As a consequence, competing requests for a capability to
a congested link that originate outside a contaminated locale
should be unaffected, or only minimally affected, by a
flooding attack, and should receive strong access guarantees.
In  contrast,  initial  capability  requests  originating  from  bot- 
contaminated  locales  should  receive  weaker  access  guarantees, 
namely  guarantees  that  depend  only  on  the  number  of  bots  in 
the  contaminated  locale  (but  not  on  all  bots  of  a  multi-domain 
attack  network).  In  short,  our  notion  of  dependable  access  to  a 
flooded  link  provides  differential  guarantees  for  the  capability 
setup  channel.  Differential  access  guarantees  are  desirable 
because  they  provide  incentives  for  employing  host  security 
measures  within  administrative  domains  that  prevent  botnet 
(and other malware) contamination. In exchange, uncontaminated
domains receive precise guarantees of link access for the
capability setup channel, which support meaningful network-link
and, ultimately, service-access guarantees.

Our  scheme  relies  on  three  basic  mechanisms.  First,  we 
define  a  new  path  identification  mechanism  that  provides 
an  unforgeable  domain  identifier  to  individual  packets,  and 
enables  remote  routers  to  identify  a  packet’s  domain  of  origin. 
Second,  we  define  a  dynamic  virtual  queueing  mechanism 
that  guarantees  a  minimum  number  of  router  buffer  slots  to 
domains  originating  flows  through  a  router,  which  in  effect, 
guarantees  link  access  to  those  domains.  Finally,  we  employ  a 
path  aggregation  mechanism  that  optimizes  router  bandwidth 
allocation  for  legitimate  capability  requests  based  on  domain 
contamination. 

II.  Background  and  Related  Work 

Lack  of  source  address  authenticity  in  the  Internet  Protocol 
(IP)  enables  attackers  to  forge  the  source  addresses,  and 
hence  complicates/prevents  address-based  accounting  during 
link flooding attacks. As a way to add authenticity to individual
packets, capability solutions [4]-[6] have been proposed.
Generally, a network-layer capability protocol requires a handshake
between a client and a server, and during that phase,
routers  on  the  forwarding  path  collectively  issue  a  connection 
capability;  i.e.,  a  series  of  router  capabilities  on  the  path.  A 
router’s  capability,  which  is  generated  by  hashing  the  source 
and  destination  IP  address  with  the  router’s  secret  key,  is 
cryptographically  secure  against  forgeries  since  the  router  key 
is  unavailable  to  an  adversary. 
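
As an illustration of the router capability described above, the following minimal sketch derives a capability as a keyed hash over the source and destination addresses. HMAC-SHA256 is an assumption standing in for the unspecified hash construction, and the names `router_capability` and `verify_capability` are hypothetical; deployed capability schemes typically add further inputs (e.g., expiry information) that are omitted here.

```python
import hashlib
import hmac
import ipaddress


def router_capability(router_key: bytes, src_ip: str, dst_ip: str) -> bytes:
    """Keyed hash over the (source, destination) address pair.

    HMAC-SHA256 is an illustrative choice, not the construction used
    by any particular capability scheme.
    """
    src = ipaddress.ip_address(src_ip).packed
    dst = ipaddress.ip_address(dst_ip).packed
    return hmac.new(router_key, src + dst, hashlib.sha256).digest()


def verify_capability(router_key: bytes, src_ip: str, dst_ip: str,
                      capability: bytes) -> bool:
    """Recompute the capability and compare in constant time."""
    expected = router_capability(router_key, src_ip, dst_ip)
    return hmac.compare_digest(expected, capability)
```

Because verification only recomputes the keyed hash, the router keeps no per-flow state; an adversary without the router key cannot produce a value that passes the check.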

However, the capability request protocol is still vulnerable
to flooding (DoC) attacks [7]. That is, flooding with capability
requests, which cannot be prioritized, successfully denies
legitimate access to a congested link. Portcullis [8] proposes
a puzzle-based mechanism that provides guaranteed link
access during a flooding (DoC) attack. Though useful, the
guarantee is linearly dependent on the number of bots, which
can be substantial (e.g., the size of a botnet easily exceeds
1 million bots [12]). Alternatively, TVA’s implementation of
fair queueing on incoming traffic paths (i.e., hierarchical fair
queueing) [6], which equally assigns queues to directly connected
links and splits the queues recursively for distant links,
places legitimate accesses of remote domains at a significant
disadvantage since it provides fair service to the same level
of queues (i.e., sub-queues split from a queue). More sophisticated
application-layer solutions (e.g., CAPTCHA [13])
that attempt to distinguish between human- and machine-
initiated traffic to prevent flooding attacks are impractical at
the network-link level.

Attempts to block suspicious traffic upstream of a congested
router by installing filters close to, or at, the domains originating
attacks could protect legitimate flows that are independent
of attacks. To be effective, cooperative filtering would require
incentives that scale with the number of participating domains,
a tall order since it depends on the attack itself. Furthermore,
with only local information (the traffic rate of incoming links),
a router cannot easily identify the links (or upstream links)
that are responsible for the congestion; and even if such
information is available, an adversary can launch a timed
attack where different groups of zombies/bots issue targeted
requests by exploiting the time delay required for installing
and releasing filters at upstream routers (e.g., on-off and rolling
attacks).

III.  Design  Overview 

In  this  section,  we  present  an  overview  of  our  defense 
scheme  by  describing  the  basic  mechanisms. 

A.  Threat 

The  main  threat  we  deal  with  in  this  work  is  a  link 
flooding  attack  on  the  capability-setup  channel,  where  attack 
sources  collaboratively  exhaust  the  link  bandwidth  allocated 
for  connection  establishment.  We  assume  that  both  hosts  and 
routers  can  be  compromised  and  send/forward  attack  traffic. 
Compromised  hosts  are  able  to  both  flood  a  target  link  with 
capability  request  packets  and  disturb  the  path  identification 
mechanism  at  a  remote  router  by  manipulating  the  header 
reserved for that purpose (viz., Section IV-A). Compromised
routers  can  disturb  path  identification  by  either  forwarding 
packets  that  contain  false  path-markings  or  adding  invalid 
path-markings  to  the  packets  they  forward. 

B.  Path  Identification 

In  this  work,  we  consider  routers  that  mark  packets  with 
path information. These path-markings create an unspoofable
origin identifier because they cannot be controlled by end-hosts
(footnote 1). In addition, path-markings enable remote routers to
construct a traffic tree. The domain connectivity revealed in
the traffic tree helps identify the distribution of attack sources
in specified locales to which bandwidth allocation will be
restricted (viz., Section VI).

The  basic  concept  of  route  construction  is  similar  to  that 
of  previous  schemes  [6],  [14],  yet  we  use  a  packet’s  AS 
(Autonomous  System)  path  as  a  domain  identifier  for  several 
reasons. First, a packet’s AS-path, which is primarily determined
by the number of AS hops (AS-path length) to the
destination in the inter-domain routing protocol (e.g., BGP-4),
is more stable than the routing path within an AS that
may  frequently  change  during  flooding  attacks  due  to  link 
state  changes  (e.g.,  link  failure).  We  use  the  AS-path  of  a 
packet  as  a  persistent  domain  identifier.  Second,  a  packet’s 
AS-path  can  be  constructed  by  the  egress  router  of  the  source 
domain  since  the  router  contains  the  AS-path  information 
of  destination  addresses  in  its  routing  table.  This  source- 
constructible  domain  identifier  eliminates  deployment  issues 
that  plagued  previous  path-marking  schemes  especially  in  the 
Internet  core,  and  hence  enables  independent  adoption  of  the 
marking  scheme  at  the  Internet  border  (e.g.,  provider/stub 
domains).  We  envision  that  prioritizing  requests  originating 
from  path-marking  domains  would  encourage  early  adoption 
of  the  marking  scheme. 

1 IP source routing may allow a client to select a path to a destination.
However, strict and loose source routing are usually blocked at routers to
avoid the associated processing overhead.

Fig. 1: Path Identifier. R4 is the egress router of AS4, and
R3, R2, R1 are the ingress routers of AS3, AS2, AS1, respectively.
R4 writes the path-identifier to the packet heading to
server S in AS1, and ingress routers on the path validate the
markings. Ci is the capability issued by Ri. Each ingress
router can validate the shaded part of the markings.


We  define  a  packet’s  AS-path  to  its  destination  as  the  path- 
identifier  of  the  packet,  and  present  it  in  the  order  of  markings: 
from  the  origin  to  the  destination.  Thus,  as  illustrated  in 
Fig. 1, the path-identifier seen at a congested router in AS1
is {AS4, AS3, AS2, AS1}. We implement this path-identifier
in a shim header so that only upgraded routers interpret it.
Throughout this paper, we denote the path-identifier whose
markings start with $AS_i$ by $S_i$, and the BGP speaker of $AS_i$ by
$R_i$. In Section IV, we present a mechanism that protects
path-identifiers from potential attacks (e.g., spoofing and replay
attacks).

C.  Link  Access  Guarantees 

In  defending  against  DoC  attacks,  our  goal  is  to  provide 
precise  guarantees  of  link  access  to  capability  requests,  where 
the guarantees are provided on a per-domain basis to confine the
effects of attacks within the domains originating attack traffic.
This goal is achieved by a new fair queueing mechanism that
allocates separate buffer slots to individual domains. The
guarantees provided by the queueing mechanism are then optimized,
using a path aggregation mechanism, to favor requests from
domains that are not contaminated by bots.

1)  Fair  Queueing  Revisited:  The  use  of  a  fair  queueing 
scheme  for  link-access  guarantees  is  intended  to  maximize 
service  on  the  legitimate  capability  requests.  Fair  queueing 
schemes,  if  they  can  assign  separate  queues  to  individual 
path-identifiers,  could  provide  fair  bandwidth  to  the  path- 
identifiers  without  link  under-utilization  (which  could  occur 
whenever  strict  bandwidth  reservation  is  made  to  individual 
path-identifiers). However, when the spatio-temporal dynamics
of domains contributing to congestion (e.g., time-varying patterns
of domain traffic) are considered, such queue assignment
in a limited buffer is a challenging problem. For example,
for a fixed buffer size, under-provisioning the number of
queues in a specific time period may fail to provide link-access
guarantees to path-identifiers due to potential queue
collisions among different path-identifiers. In contrast,
over-provisioning the number of queues would decrease the length
of individual queues, and hence weaken the guarantees (viz., Section V).
Thus, we aim to design a fair queueing scheme that assigns a unique
queue to each path-identifier and adjusts the individual queue
lengths to fit the buffer size, both to provide link-access
guarantees and to strengthen them.

While  a  variety  of  traditional  fair  queueing  schemes  focus 
on  the  bandwidth  fairness  of  flows  in  different  queues  that 
contain  various  sizes  of  packets,  the  Stochastic  Fair  Queueing 
(SFQ)  scheme  [15]  offers  queue  length  fairness  via  a  buffer 
stealing  mechanism,  whereby  a  packet  that  finds  a  full  buffer 
on  its  arrival  would  steal  a  buffer-slot  from  the  longest  queue. 
We note that the fixed-size capability request packet would
eliminate the intrinsic bandwidth unfairness of SFQ in the
presence of different packet sizes [16]. Based on the buffer-stealing
idea, we improve SFQ in two respects. First, we
avoid the queue collisions among path-identifiers that SFQ
allows and merely distributes fairly via stochastic queue assignment.
Second, we make queue management operations (e.g., queue
assignment and buffer-slot preemption) scalable and efficient
to easily adapt our scheme to diverse operating environments
(e.g., link capacity, the number of required queues). These
improvements are made via a dynamic virtual queueing mechanism
presented in Section V.

2) Path Aggregation: As more domains are contaminated
by attack sources, link-access guarantees provided by our
queueing scheme become weak as both the available link bandwidth
and the buffer-slots for each path-identifier decrease. This
undesirable dependency of guarantees on attack dispersion is
unavoidable as long as all path-identifiers are treated equally.
Protecting requests of uncontaminated domains essentially
needs a differential treatment of path-identifiers based on the
proportion of legitimate requests they deliver. Though the legitimacy
of individual capability requests cannot be validated,
the proportion of legitimate requests in a set of requests can be
estimated using two flow conformance tests. These tests
consist of (1) a test on bandwidth conformance that represents
the aggressiveness of requests and (2) a test on protocol
conformance that indicates the legitimacy of authorized flows
in various respects (viz., Section VI-A).

Conformance tests performed on each path-identifier enable
differential assignment of bandwidth to path-identifiers
that  maximizes  service  to  legitimate  requests  at  the  flooded 
link.  Yet,  in  the  presence  of  a  large  number  of  attack  domains, 
such  assignment  cannot  easily  be  made,  nor  can  it  tolerate 
imprecise  measurement  of  domain  contamination.  Instead,  we 
aggregate  the  path-identifiers  of  a  highly  contaminated  locale 
and  assign  a  new  path-identifier  to  them.  This,  in  effect, 
limits  both  available  bandwidth  and  buffer  space  for  those 
path-identifiers.  We  define  this  path  aggregation  problem  as 
a  constrained  optimization  problem  and  provide  an  efficient 
solution  in  Section  VI-C. 

IV.  Path  Identification 

In  this  section,  we  first  describe  the  basic  path  identification 
mechanism,  and  then  enhance  the  mechanism  with  additional 
security  features. 

The  basic  path  identification  mechanism  works  as  follows. 
When  the  egress  router  of  a  domain  (i.e.,  the  BGP  speaker) 
forwards  a  packet  that  originates  from  its  domain,  it  writes 
the path-identifier (i.e., the AS-path to the destination) in the
packet’s header. AS ingress routers on the packet’s forwarding
path validate the authenticity of a fraction of this
path-identifier, starting with the upstream AS that forwarded the
packet and ending with the destination AS, as shown in Fig.
1. Whenever AS ingress routers receive a non-marked packet,
they write their own path-markings: the AS-path from their
upstream AS to the destination AS.
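
The basic marking and validation steps can be sketched as follows. This is an illustrative model only: AS numbers are plain integers, the function names are hypothetical, and the shim-header encoding is not modeled. An ingress router is assumed to know its upstream AS and its own AS-path to the destination, so it can check only the suffix of the marking that begins at the upstream AS.

```python
from typing import List, Optional


def mark_at_egress(as_path_to_dst: List[int]) -> List[int]:
    """Egress BGP speaker writes its AS-path, origin first."""
    return list(as_path_to_dst)


def process_at_ingress(marking: Optional[List[int]],
                       upstream_as: int,
                       local_path_to_dst: List[int]) -> Optional[List[int]]:
    """Return the marking to forward, or None if validation fails.

    local_path_to_dst is the AS-path from this router's own AS to the
    destination AS, as found in its routing table.
    """
    expected_suffix = [upstream_as] + list(local_path_to_dst)
    if marking is None:
        # Non-marked packet: write the path from the upstream AS onward.
        return expected_suffix
    if len(marking) < len(expected_suffix):
        return None
    if marking[-len(expected_suffix):] != expected_suffix:
        return None
    # Anything before the upstream AS cannot be checked at this router.
    return marking
```

With the path of Fig. 1, the ingress router of AS2 (upstream AS3) accepts `[4, 3, 2, 1]` but would equally accept `[9, 3, 2, 1]`, since the portion before AS3 is outside what it can validate. This limitation motivates the secure mechanism that follows.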

As  remote  domains  can  validate  only  a  part  of  path- 
markings,  attack  sources  in  non-path-marking  domains  may 
spoof  path-identifiers  unless  the  marking  scheme  (which 
includes  the  verification  function)  is  sufficiently  deployed. 
Even  under  wide  deployment  of  the  marking  scheme,  the 
authenticity  of  path-identifiers  verified  at  a  domain  cannot  be 
delegated  to  the  downstream  domains  without  a  strong  trust 
relationship  established  between  those  domains.  This  makes 
any  manipulation  of  path-identifiers  by  compromised  routers 
undetectable  at  remote  routers.  To  protect  path-identifiers  from 
these  attacks  (i.e.,  spoofing  and  replay  attacks),  we  present  a 
secure  path  identification  mechanism  below. 

A.  Unspoofable  path-identifier 

We first introduce potential attacks that disturb path
identification at remote routers and present our defense mechanism
against those attacks.

Let $\{AS_n, \ldots, AS_2, AS_1\}$ be the path-identifier seen at the
congested router, and let $*$ and $\#$ be any valid and forged
sequence of markings, respectively. If the domains up to $AS_i$
are unprotected by our path identification (which includes
both path marking and verification) scheme, both compromised
sources in non-path-marking domains and compromised
routers in $AS_k$ can forge a path-identifier as $\{\#, AS_i, *, AS_1\}$.

In  principle,  a  router  can  authenticate  its  path-markings 
to  other  routers  by  adding  a  digital  signature  to  the  path- 
markings.  However,  adding  a  different  digital  signature  to 
every  packet  would  impose  significant  computational  overhead 
for both its generation and verification. Moreover, a per-packet
signature, if employed, could be exploited by attackers to exhaust
routers’ computational resources (e.g., by flooding small-size
packets). To reduce authentication overhead, we design an
efficient path-identifier authentication mechanism, where each
domain pre-distributes its domain-authenticator and uses it to
authenticate its path-markings. One fundamental assumption
for  implementing  this  mechanism  is  that  any  protected  AS 
has  a  public-private  key  pair  certified  by  a  trusted  certificate 
authority  (e.g.,  ICANN  [17]). 

1) Authenticator Distribution: When a BGP speaker advertises
an address prefix that belongs to its domain, the BGP
speaker adds an origin authentication number (OAN), which is
unique in its domain and is digitally signed with the domain’s
private key, to its route advertisement. All BGP routers that
receive this route advertisement authenticate the OAN using
the origin AS’s public key and hold the authenticated ASN
(AS Number)-OAN pair for later path-identifier authentication.


[Fig. 2 diagram: example path-identifiers under three deployment scenarios, including a path identifier from an unprotected domain (AS4 and AS3 are non-marking domains) and a forged path identifier (AS3 is a non-marking domain).]

Fig. 2: Path-identifier Authentication. (1) Path-identifier written
at the packet’s origin (AS4) can be validated at any domain
(AS2, AS1) in the presence of a non-marking domain(s) (AS3)
on the packet’s forwarding path. (2) If the origin AS does not
participate in path-marking, the first participant (AS2) writes
its markings and adds the incoming AS number (AS3) to
distinguish the packets it forwards from the ones originating
from it. (3) An invalid ASN-OAN pair (denoted by #) can be
detected and filtered.

Since the number of ASN-OAN pairs is at most 65,535 (footnote 2), the
space requirement for this validation is bounded, i.e., 262KB
for 4-Byte OANs.
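
The 262KB bound follows from 65,535 entries of 4 bytes each (262,140 bytes) when the table is indexed directly by ASN, so that only the OAN itself is stored. The sketch below assumes such a flat layout; the class `OanTable` and its methods are hypothetical, and an all-zero slot is used here to mean that no OAN has been authenticated yet.

```python
OAN_BYTES = 4
MAX_ASN = 65_535  # number of 16-bit ASNs assumed in the text


class OanTable:
    """Flat ASN-indexed table holding one authenticated OAN per AS."""

    def __init__(self) -> None:
        self._slots = bytearray(MAX_ASN * OAN_BYTES)

    def _offset(self, asn: int) -> int:
        if not 1 <= asn <= MAX_ASN:
            raise ValueError("ASN out of 16-bit range")
        return (asn - 1) * OAN_BYTES

    def store(self, asn: int, oan: bytes) -> None:
        if len(oan) != OAN_BYTES:
            raise ValueError("OAN must be exactly 4 bytes")
        off = self._offset(asn)
        self._slots[off:off + OAN_BYTES] = oan

    def lookup(self, asn: int) -> bytes:
        off = self._offset(asn)
        return bytes(self._slots[off:off + OAN_BYTES])

    def size_bytes(self) -> int:
        return len(self._slots)
```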

2)  Origin  Authentication:  The  BGP  speaker  of  a  packet’s 
domain  of  origin  writes  its  ASN-OAN  pair  followed  by  the 
AS-path  to  the  destination  in  the  path-identifier  header.  Fig. 
2  illustrates  the  cases  for  origin  authentication  under  different 
deployment scenarios of the marking scheme. Whenever no
path-identifier is present in a packet, the ingress router of
a marking AS constructs path-markings with its own ASN-OAN
pair (viz., (2) in Fig. 2). On receiving path-identifiers
constructed as such, the AS ingress routers on the later
path validate the origin’s OAN and the partial AS-path as
discussed above (footnote 3). In this way, the routers on the way to, or
at, the destination AS can identify any path-markings forged
by adversaries, even in the presence of consecutive
non-path-marking ASs on the path, and filter packets carrying those
forged path-markings.

While a compromised router in $AS_i$ can still forge
two valid types of path-identifiers, such as $\{AS_i, OAN_i, *\}$
and $\{\#, AS_i, OAN_i, *\}$, their effects can be limited to at
most those of two path-identifiers by discarding the
non-authenticated prefixes of path-identifiers.

B.  Preventing  Replay  Attacks 

Under  partial  deployment  of  our  path-marking  scheme, 
attack  sources  in  non-path-marking  domains  may  forge  path- 
identifiers  ending  with  authenticated  ASN-OAN  pairs  (since 
ASN-OAN  pairs  are  not  confidential  to  end-hosts)  and  use 
them in flooding a target link. Such replay attacks would
significantly affect the requests from path-marking domains
and hence prevent those domains from receiving incentives for
early adoption of the path-marking scheme.

2 As of 2010, the number of advertised ASNs is about 35,000 out of 65,535
(16-bit) possible ASNs [18].

3 For path validation, routers need to keep AS-path information (from next
hop to the destination AS) in their forwarding table (i.e., FIB). However, this
would not require much space since the average number of ASs a packet
traverses from its origin to destination is four.

Path-marking routers counter replay attacks via fast OAN
renewals, which are efficiently implemented using a reverse
hash chain [8]. Let $OAN_i^0$ be the initial OAN of $AS_i$. $AS_i$
constructs a hash chain of OANs by repeatedly hashing
$OAN_i^0$ with a cryptographic hash function (i.e., $OAN_i^k =
Hash(OAN_i^{k-1} \| AS_i \| k-1)$ for $1 \le k \le M$), and distributes
$OAN_i^M$ when advertising a route. We include $AS_i$ and $k-1$
in generating each OAN to produce distinct OAN sequences for
different ASs and initial OANs, respectively. A BGP speaker
uses $OAN_i^k$ during a predefined interval, and changes it to
$OAN_i^{k-1}$ in the next interval. Hence, without breaking the
hash function, an attacker cannot construct the valid sequence
of OANs to be used. An (ingress) router can authenticate
$OAN_i^k$ by computing $Hash(OAN_i^k \| AS_i \| k)$ and comparing it
with $OAN_i^{k+1}$. This OAN authentication is performed only
once for every OAN renewal. Once $OAN_i^k$ is used, $OAN_i^{k+1}$
is invalidated. Note that if the OAN renewal period is less than
the time required for replaying OANs, replay attacks will be
effectively prevented. The length of an OAN hash chain ($M$)
is determined in consideration of the OAN renewal period
to avoid frequent OAN distribution. For example, if a 20-bit
sequence number ($M \approx 1$ million) and a 500ms OAN renewal
period are used, a domain needs to advertise its OAN only once
every six days. We also note that routers in different domains
need not be time-synchronized, as an OAN carries its sequence
number that is specific to the domain.
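
A minimal sketch of the reverse hash chain is given below. SHA-256, the 4-byte big-endian encodings of the AS number and sequence index, and all function names are assumptions made for illustration; the text fixes only the structure Hash(OAN || AS || index).

```python
import hashlib
import hmac
from typing import List


def _step(oan: bytes, asn: int, index: int) -> bytes:
    """One chain link: maps the OAN with sequence number index to index + 1."""
    data = oan + asn.to_bytes(4, "big") + index.to_bytes(4, "big")
    return hashlib.sha256(data).digest()


def build_chain(seed: bytes, asn: int, length: int) -> List[bytes]:
    """chain[k] is the OAN with sequence number k; chain[length] is advertised."""
    chain = [seed]
    for k in range(1, length + 1):
        chain.append(_step(chain[k - 1], asn, k - 1))
    return chain


def verify_renewal(candidate: bytes, k: int, asn: int, current: bytes) -> bool:
    """Accept candidate as OAN number k if it hashes to the trusted OAN k + 1."""
    return hmac.compare_digest(_step(candidate, asn, k), current)


def chain_lifetime_days(length: int, renewal_seconds: float) -> float:
    """How long one advertised chain lasts before a new one is needed."""
    return length * renewal_seconds / 86_400
```

A verifying router starts from the advertised end of the chain and, at each renewal, replaces its trusted value with the newly accepted OAN. For the parameters in the text, `chain_lifetime_days(2**20, 0.5)` evaluates to roughly 6.07 days, which matches the stated six-day advertisement interval.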

V.  Dynamic  Virtual  Queueing 

In this section, we describe a dynamic virtual queueing
mechanism for link-access guarantees on path-identifiers.
Our dynamic virtual queueing mechanism is designed to
assign a separate queue to each active path-identifier and provide
queue-length fairness to the path-identifiers in a min-max
manner. For these purposes, a router manages virtual
queues, rather than physically separate queues, that are distinguished
by the path-identifier ($S_i$), its count at time $t$
($N_{S_i}(t)$) and packet location (memory address) ($A_{S_i}$) in
the buffer; i.e., $(S_i, N_{S_i}(t), A_{S_i})$. Given these tuples and
the buffer size $L_q$, queue-length fairness on path-identifiers
($\min \max_{S_i \in \mathcal{S}} N_{S_i}(t)$ for $\sum_{S_i \in \mathcal{S}} N_{S_i}(t) = L_q$)
can be described by the following buffer-slot preemption policy. If a
packet finds the buffer full on its arrival, it preempts a buffer-slot
from the longest virtual queue. If the arrived packet belongs
to the longest virtual queue, or its preemption produces
another longest virtual queue, the packet is dropped.
This preemption policy ensures guaranteed buffer-slots to
each path-identifier if the number of buffered path-identifiers
is bounded. We assume that the number of buffered
path-identifiers can be statistically or deterministically bounded at
a router (i.e., the minimum bandwidth to a legitimate
path-identifier can be determined).
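
The preemption policy can be sketched as follows. Two points are assumptions rather than statements from the text: the packet removed from the longest virtual queue is its most recently buffered one, and "produces another longest virtual queue" is read as the arriving queue reaching or exceeding the current longest length after the swap. The class name is hypothetical, and the linear scan for the longest queue is for clarity only; the constant-time accounting is the subject of the next subsection.

```python
from collections import Counter, deque
from typing import Deque, Dict, Hashable


class VirtualQueueBuffer:
    """Shared buffer with per-path-identifier virtual queues."""

    def __init__(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self.counts: Counter = Counter()          # path-id -> buffered packets
        self.slots: Dict[Hashable, Deque[object]] = {}
        self.size = 0

    def _store(self, path_id: Hashable, packet: object) -> None:
        self.slots.setdefault(path_id, deque()).append(packet)
        self.counts[path_id] += 1
        self.size += 1

    def enqueue(self, path_id: Hashable, packet: object) -> bool:
        """Return True if the packet was buffered, False if dropped."""
        if self.size < self.capacity:
            self._store(path_id, packet)
            return True
        # Buffer full: find the longest virtual queue (O(n) scan here).
        longest, longest_len = max(self.counts.items(), key=lambda kv: kv[1])
        # Drop if the arrival belongs to the longest queue, or if the swap
        # would make the arrival's queue as long as the current longest.
        if path_id == longest or self.counts[path_id] + 1 >= longest_len:
            return False
        self.slots[longest].pop()                 # preempt one buffer-slot
        self.counts[longest] -= 1
        self.size -= 1
        self._store(path_id, packet)
        return True
```

With a 6-slot buffer initially filled by one path-identifier, arrivals from a second and then a third path-identifier preempt slots until each holds 2 slots, i.e., the buffer size divided by the number of buffered path-identifiers.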


A.  Implementing  Buffer-slot  Preemption 

For efficient and scalable accounting of virtual queue
lengths, we use a new Counting Bloom Filter (CBF) that holds
the number of buffer-slots occupied by path-identifiers and
provides lookup, add and remove operations in $O(1)$ time (a
modified version of CBF [19]). CBF consists of $m$ counter
arrays of size $2^b$ ($a_1, a_2, \ldots, a_m$) and $m$ hash functions of
$b$-bit output ($H_1, H_2, \ldots, H_m$), where $a_j$ is associated with $H_j$.
For an input to CBF, each hash function maps its output to
the corresponding array position; e.g., $a_j[H_j(S_i)]$ corresponds
to the input $S_i$ for $1 \le j \le m$.

Path-identifier accounting in CBF works as follows. All
array values are initialized to zero. When a packet is added
to the buffer, its path-identifier is fed into CBF. Then, CBF
locates $m$ array positions for the path-identifier, and increases
the corresponding array values. The same applies
to a packet removal from the buffer, yet the counter values
are decreased. In this scheme, the limited hash output
size (i.e., $2^b$) could cause hash-output collisions among
path-identifiers. Such collisions would cause the corresponding
array values to be increased by multiple path-identifiers, and hence
corrupted. However, unless all of the array values associated
with $S_i$ are corrupted, we can compute the count of
buffered $S_i$’s by taking the minimum of the array values;
i.e., $\min\{a_1[H_1(S_i)], a_2[H_2(S_i)], \ldots, a_m[H_m(S_i)]\}$. Since the
probability that all $m$ array values of a path-identifier are
corrupted is $(1 - (1 - 1/2^b)^{|\mathcal{S}|})^m$ for $|\mathcal{S}|$ buffered
path-identifiers [20], we can make the probability negligible by
increasing the array size ($2^b$) or the number of arrays ($m$).

Path-identifiers that occupy more buffer slots than the
guaranteed amount (i.e., $\lfloor L_Q / |S| \rfloor$) should be kept track of for
possible preemption. To this end, a router maintains a table, named
Path-Identifier Record (PIR), that holds over-buffered
path-identifiers, their counts and corresponding packet locations.
In PIR, a path-identifier is stored as the concatenation of its
$m$ hash outputs, defined as the “path-signature.” This enables fast
buffer-slot preemption because the preempted packet’s
path-signature would directly locate the array values that need to be
decreased in CBF.
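
The accounting scheme above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the paper: the hash family (SHA-256 salted with the array index) and the example path-identifier string are hypothetical choices, and PIR bookkeeping is omitted.

```python
import hashlib


class CountingBloomFilter:
    """m counter arrays of size 2**b, one hash function per array."""

    def __init__(self, m: int, b: int):
        self.m, self.b = m, b
        self.arrays = [[0] * (1 << b) for _ in range(m)]

    def signature(self, path_id: bytes) -> tuple:
        """Path-signature: the m array positions (hash outputs) of path_id."""
        positions = []
        for j in range(self.m):
            # Hypothetical hash family: SHA-256 salted with the array index j.
            digest = hashlib.sha256(bytes([j]) + path_id).digest()
            positions.append(int.from_bytes(digest[:4], "big") % (1 << self.b))
        return tuple(positions)

    def add(self, sig: tuple) -> None:
        """A packet with this signature entered the buffer."""
        for j, pos in enumerate(sig):
            self.arrays[j][pos] += 1

    def remove(self, sig: tuple) -> None:
        """A packet with this signature left the buffer (or was preempted)."""
        for j, pos in enumerate(sig):
            self.arrays[j][pos] -= 1

    def count(self, sig: tuple) -> int:
        """Estimated buffered count: the minimum over the m counters."""
        return min(self.arrays[j][pos] for j, pos in enumerate(sig))


# Usage: three arrivals and one departure for the same path-identifier.
cbf = CountingBloomFilter(m=8, b=14)
sig = cbf.signature(b"AS-path-example")  # illustrative identifier only
for _ in range(3):
    cbf.add(sig)
cbf.remove(sig)
buffered = cbf.count(sig)
```

Because the signature already contains the array positions, a preempted packet's counters can be decremented without rehashing, which is the property the PIR design relies on.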

B.  Probabilistic  Guarantees 

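If packet arrivals carrying path-identifier $S_i$ are modeled
as a Poisson process and $k$ buffer-slots are allocated to $S_i$,
the probabilistic lower bound of $S_i$'s link access (denoted by
$G(|S|, k, S_i)$) is provided as follows.

$$G(|S|, k, S_i) = \begin{cases} \sum_{j=0}^{k-1} \dfrac{(k\rho_{S_i})^j}{j!}\, e^{-k\rho_{S_i}}, & \rho_{S_i} \le 1 \\[6pt] \ldots, & \rho_{S_i} > 1 \end{cases} \tag{V.1}$$

where $\lambda_{S_i}$ is the request rate of $S_i$, $\rho_{S_i}$ is the
bandwidth utilization of $S_i$ (the ratio of its request rate to its
allocated bandwidth), and $Q_c = \sum_{j=0}^{\cdots} \frac{(k\rho_{S_i})^j}{j!}\, e^{-k\rho_{S_i}}$.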
We justify the Poisson arrival model of capability requests
with two reasons: (1) during the short interval over which the
guarantees are defined (i.e., the maximum queueing delay of a
router, $\Delta_Q$), the capability requests by different clients can be
assumed independent; and (2) a single capability can be used
for multiple correlated sessions that need to be established for
most Web applications (that is, multiple correlated capability
requests are unnecessary). Under this model, if $\rho_{S_i} \le 1$,
an arrival of $S_i$ is guaranteed to be serviced if fewer than $k$
arrivals of $S_i$ have occurred in $\Delta_Q$. If $\rho_{S_i} > 1$, an arrival
of $S_i$ is guaranteed to be serviced only if its queue length
is less than $k$. Thus, Eq. (V.1) can be easily proved. The
probabilistic guarantee of $S_i$'s link access is provided by
setting $|S| = |S|_{max}$ and $k = \lfloor L_Q / |S|_{max} \rfloor$. We provide a full
proof in Appendix A.
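
The low-utilization case of the guarantee can be evaluated numerically. The sketch below covers only the case in which the utilization does not exceed 1, where the bound is the probability that fewer than k Poisson arrivals occur when the mean number of arrivals is k times the utilization; the function name is hypothetical, and the high-utilization case is deliberately not implemented.

```python
import math


def guarantee_low_utilization(k: int, rho: float) -> float:
    """P(fewer than k Poisson arrivals) with mean k * rho, for 0 <= rho <= 1."""
    if not 0.0 <= rho <= 1.0:
        raise ValueError("this sketch only covers utilizations in [0, 1]")
    mean = k * rho
    # Poisson CDF evaluated at k - 1: sum_{j=0}^{k-1} mean^j / j! * exp(-mean)
    return math.exp(-mean) * sum(mean ** j / math.factorial(j) for j in range(k))


# Usage: the bound tightens as the utilization of the path-identifier drops.
half_loaded = guarantee_low_utilization(k=8, rho=0.5)
fully_loaded = guarantee_low_utilization(k=8, rho=1.0)
```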

C.  Resource  Requirements 

1) Request Packet Buffer: A large buffer ($L_Q$) for capability
request packets is preferable since it would not only improve
the guarantees (viz., Eq. (V.1)) but also handle the requests
from spontaneously created, short-lived paths. However, the
size of the buffer should be bounded in consideration of the
maximum allowed queueing delay to avoid unnecessary retries
from flow sources. For example, if we assume a 0.25-second
maximum queueing delay and a 128 B$^4$ request packet size, for
a 2.5 Gbps link$^5$, a router requires a 4.0 MB buffer (when 5%
of link bandwidth is allocated for capability requests [6]),
with which it can provide 8 guaranteed buffer slots to each of up
to 3.75K path-identifiers.
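
The buffer-sizing example can be reproduced with a short calculation. This is an illustrative sketch with hypothetical function and parameter names; its unrounded results (about 3.9 MB and about 3.8K path-identifiers) are consistent with the rounded figures of 4.0 MB and 3.75K quoted above.

```python
def request_buffer(link_bps: float, request_share: float,
                   max_delay_s: float, packet_bytes: int,
                   slots_per_path: int):
    """Return (buffer size in bytes, buffer size in packets, path-identifiers served)."""
    # Bytes that the request channel can drain within the maximum queueing delay.
    buffer_bytes = link_bps * request_share * max_delay_s / 8
    buffer_packets = int(buffer_bytes // packet_bytes)
    max_path_ids = buffer_packets // slots_per_path
    return buffer_bytes, buffer_packets, max_path_ids


buffer_bytes, buffer_packets, max_path_ids = request_buffer(
    link_bps=2.5e9, request_share=0.05, max_delay_s=0.25,
    packet_bytes=128, slots_per_path=8)
```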

2) Path-Identifier Accounting: The memory requirement
for CBF is determined by a target false-positive ratio. The
false-positive ratio of a CBF is determined by
$(1 - (1 - 1/2^b)^{|S|})^m \approx (1 - e^{-|S|/2^b})^m = (1 - e^{-L_Q/(k \cdot 2^b)})^m$,
since $L_Q = k \cdot |S|$. Hence,
for a desired false-positive ratio, the size of each counter array
in CBF, which is the same as the size of the hash output ($2^b$), is
linear in the buffer size (i.e., $O(L_Q)$). For example, a CBF
with 8 hash functions of 14-bit outputs would require $8 \times 2^{14}$
(hash outputs) $\times$ 1 B (8-bit counter) $= 131$ KB of memory space while
producing a reasonably low false-positive ratio ($3.07 \times 10^{-4}$%)
in the presence of 3.75K path-identifiers.
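
The numbers in this example can be checked directly from the approximate false-positive expression. The following sketch uses hypothetical function names and assumes one byte per counter.

```python
import math


def cbf_false_positive(num_path_ids: int, m: int, b: int) -> float:
    """Approximate false-positive ratio (1 - exp(-|S| / 2**b)) ** m."""
    return (1.0 - math.exp(-num_path_ids / 2 ** b)) ** m


def cbf_memory_bytes(m: int, b: int, counter_bits: int = 8) -> int:
    """Memory of m counter arrays with 2**b counters each."""
    return m * 2 ** b * counter_bits // 8


fp_ratio = cbf_false_positive(num_path_ids=3750, m=8, b=14)
memory = cbf_memory_bytes(m=8, b=14)
```

With these inputs the ratio evaluates to roughly 3.07e-6 (i.e., about 3.07e-4 percent) and the memory to 131,072 bytes, matching the figures in the text.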

PIR holds the path-identifiers whose count exceeds the
guaranteed amount for possible preemption. Hence, the memory
requirement is bounded by $L_Q/(k+1) \times$ (16 B (path-signature) +
4 B (address pointer)) (e.g., 60 KB for the above example), since the
number of path-signatures in PIR has its maximum when all
path-identifiers have $k+1$ packets in the buffer. Hence, the memory
requirement for both CBF and PIR is $\Theta(L_Q)$.

VI.  Path  Aggregation 

In this section, we first describe a mechanism for estimating
the proportion of legitimate requests of individual
path-identifiers, and then a path-identifier aggregation mechanism
that maximizes the goodput ratio, defined as the proportion
of legitimate requests in all serviced requests, at a congested
link. Aggregating path-identifiers produces an optimal traffic
tree to which applying our queueing mechanism maximizes the
goodput ratio at the congested link.

$^4$ We reserve an 88 B shim header: 40 B for path-identifiers (up to 10 AS
markings), 8 B for an origin authenticator, and 40 B for 5 capabilities.

$^5$ 2.5 Gbps (OC-48) links are widely used for ISPs’ backbone links.

A.  Goodput  Estimation 

In the absence of any other useful information regarding the
origin of attack sources and the path-identifiers assigned to
them, the request rate of path-identifier $S_i$ ($\lambda_{S_i}$) can be used
as a unique measure for estimating the goodput ratio of $S_i$.
We define the bandwidth conformance of path-identifier $S_i$
as $\min\{1, 1/\rho_{S_i}\}$ (the ratio of the bandwidth assigned to $S_i$ to
its request rate, capped at 1) to represent how the request rate of
$S_i$ conforms to the bandwidth assigned to it, and denote it by
$\mathcal{E}^{B}_{R_i}$; i.e., $\mathcal{E}^{B}_{R_i} = \min\{1, 1/\rho_{S_i}\}$ (recall that $S_i$ is
assigned to all packets originating from $R_i$).

Additionally, we estimate domain contamination more
accurately by identifying the following attack flows.

Unauthorized flows: A capability issued by a router during
the connection establishment phase of a flow must be used
at least once for actual data transmission unless it is denied
afterward by application services, firewalls or IDSs. Thus, the
proportion of unused capabilities could effectively measure
domain contamination as it reflects the strong flow authorization
results applied at the network ends.

High-rate flows: Flows that send high-rate traffic using
valid capabilities would exhibit high packet-drop rates as
indicated in [21]. Hence, if a router implements per-domain
bandwidth control,$^6$ high-rate attack flows within a domain
can be identified by capability drop rates [22].

High-fanout sources: If sources are allowed to establish
an unlimited number of connections with other destinations
through the congested link, they can deplete the link’s bandwidth
with a large number of legitimate-looking flows [23]. This
insidious attack will be prevented if a router limits the number
of per-source capabilities as follows.

Let $C_{f_{s,d}}$ be the capability for a flow $f_{s,d}$ between a source
$s$ and a destination $d$. $C_{f_{s,d}}$ consists of two parts, namely
$C_{f_{s,d}} = C^{0}_{f_{s,d}} \,\|\, C^{1}_{f_{s,d}}$. Here, $C^{i}_{f_{s,d}}$ ($i = 0, 1$) is defined as:

$$C^{0}_{f_{s,d}} = \mathrm{Hash}(IP_s, IP_d, K^{0}_r)$$

$$C^{1}_{f_{s,d}} = \mathrm{Hash}(IP_s, f(IP_d), K^{1}_r)$$

where $IP_s$ and $IP_d$ are the source and destination IP addresses,
$K^{0}_r$ and $K^{1}_r$ are the router’s secret keys, and $f(\cdot)$ is a function
whose output is uniformly random on $[0, n_{max}-1]$.

$C^{0}_{f_{s,d}}$ provides identifier authenticity to flows [5], [6], and
$C^{1}_{f_{s,d}}$ restricts the number of per-source capabilities to $n_{max}$
by taking $f(IP_d)$ as a hash input. If $C^{1}_{f_{s,d}}$ is used for
estimating flow bandwidth, flows of high-fanout sources would be
aggregated and turn into high-rate flows.
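
The two-part capability can be illustrated with keyed hashes. This is a sketch under stated assumptions rather than the construction used in the paper: HMAC-SHA256 stands in for Hash, the bucketing function f is a plain SHA-256 reduction modulo n_max, and the 8-byte part length, the value of N_MAX, and all function names are hypothetical.

```python
import hashlib
import hmac
import ipaddress

N_MAX = 16  # hypothetical limit on per-source capabilities


def keyed_hash(key: bytes, *fields: bytes) -> bytes:
    """Stand-in for Hash(...): HMAC-SHA256 truncated to 8 bytes."""
    return hmac.new(key, b"|".join(fields), hashlib.sha256).digest()[:8]


def bucket(dst_ip: str, n_max: int = N_MAX) -> int:
    """Stand-in for f(IP_d): maps a destination into [0, n_max - 1]."""
    digest = hashlib.sha256(ipaddress.ip_address(dst_ip).packed).digest()
    return int.from_bytes(digest[:8], "big") % n_max


def capability(src_ip: str, dst_ip: str, key0: bytes, key1: bytes,
               n_max: int = N_MAX) -> bytes:
    """Return C0 || C1 for the flow (src_ip, dst_ip)."""
    src = ipaddress.ip_address(src_ip).packed
    dst = ipaddress.ip_address(dst_ip).packed
    c0 = keyed_hash(key0, src, dst)                       # identifier authenticity
    c1 = keyed_hash(key1, src, bucket(dst_ip, n_max).to_bytes(4, "big"))
    return c0 + c1


# Usage: one source contacting 500 destinations obtains at most N_MAX
# distinct C1 parts, so bandwidth accounting keyed on C1 aggregates its flows.
caps = [capability("192.0.2.1", f"10.0.{i // 256}.{i % 256}", b"key-0", b"key-1")
        for i in range(500)]
distinct_c1 = len({c[8:] for c in caps})
```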

The above attack-flow identification measures help estimate
the proportion of legitimate flows in flows carrying $S_i$, which
we define as the protocol conformance of $S_i$ and denote by
$\mathcal{E}^{P}_{R_i}$.

$^6$ Flows in different domains could exhibit different drop rates due to
different RTTs.

Based on the bandwidth and protocol conformances, the
conformance estimate $\mathcal{E}_{R_i}$ of $S_i$, representing the estimate of
$S_i$'s goodput ratio, is defined as:

$$\mathcal{E}_{R_i} = e^{-\gamma(\ldots)}\,\mathcal{E}^{B}_{R_i} + \left(1 - e^{-\gamma(\ldots)}\right)\mathcal{E}^{P}_{R_i}$$

$$\bar{\mathcal{E}}_{R_i}(t_j) = (1 - \alpha)\,\mathcal{E}_{R_i} + \alpha\,\bar{\mathcal{E}}_{R_i}(t_{j-1})$$

where $\gamma$ and $\alpha$ are the weighting coefficients.

The conformance estimate of $S_i$ is the weighted average
of the bandwidth conformance and the protocol conformance,
where the weighting factor exponentially favors the protocol
conformance once sufficient requests have been made. In this
way, we prevent a domain’s conformance estimate from being
highly biased by its (low) request rate; e.g., unexpected packet
drops of a low-rate path-identifier would produce a very
low protocol conformance for the corresponding domain. We
determine $\bar{\mathcal{E}}_{R_i}$ at time $t_j$ by taking the moving average of
the $\mathcal{E}_{R_i}$'s, and update it once in every aggregation period
($\Delta_{agg}$); i.e., $t_j - t_{j-1} = \Delta_{agg}$.

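
The per-period update of the conformance estimate is an exponentially weighted moving average. A minimal sketch follows, with a hypothetical function name and illustrative sample values; the smoothing coefficient of 0.5 is an arbitrary choice for the example.

```python
def update_conformance(previous_avg: float, current_estimate: float,
                       alpha: float) -> float:
    """One moving-average step: (1 - alpha) * current + alpha * previous."""
    return (1.0 - alpha) * current_estimate + alpha * previous_avg


# Usage: a path whose per-period estimate drops from 1.0 to 0.2.
avg = 1.0
history = []
for estimate in [1.0, 0.2, 0.2, 0.2]:   # illustrative per-period estimates
    avg = update_conformance(avg, estimate, alpha=0.5)
    history.append(avg)
```

A larger coefficient makes the estimate react more slowly to a sudden change, which trades responsiveness to rolling attacks against stability for low-rate paths.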
B.  Aggregation  Problem 

For path aggregation, the congested router $R_0$ builds the
traffic tree $T_{R_0}$ using the path identifiers carried in the active
flows and decomposes $T_{R_0}$ into a legitimate tree $T^{L}_{R_0}$ and
an attack tree $T^{A}_{R_0}$. $T^{L}_{R_0}$ is constructed with legitimate
path-identifiers that have a higher conformance estimate than a
certain threshold ($\mathcal{E}_{th}$), and $T^{A}_{R_0}$ is constructed with the other
(non-legitimate) path-identifiers. Then, the router constructs a
new traffic tree $T'_{R_0}$ by merging those two trees at the root
(i.e., the disjoint union of $T^{L}_{R_0}$ and $T^{A}_{R_0}$). Path aggregation is
performed on this new traffic tree $T'_{R_0}$, so that legitimate paths
would never be aggregated with attack paths.

The congested router starts path aggregation from
neighboring domains (i.e., domains with longest suffix-matching
path-identifiers) to localize attack effects, and proceeds with
aggregation until a desired number of path reductions are
made (viz., Eq. (VI.1)). Aggregation is performed with respect
to the conformance estimate of each path since link-access
guarantees should not be biased by the path’s request rate.
Hence, if the number of access-guaranteed path-identifiers
is $|S|_{max}$, the path aggregation problem is to construct an
optimal tree which has $|S|_{max}$ distinct paths and to which
providing link-access guarantees maximizes the goodput ratio at
the congested link. This can be defined as the constrained
optimization problem below.

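Let $\mathcal{R}$ be the set of all nodes in $T'_{R_0}$, and $\mathcal{R}_i$ be the set of
leaf nodes of a subtree rooted at $R_i \in T'_{R_0}$ (i.e., $T_{R_i}$). Then,
the optimization problem is defined as:

$$\max O(T'_{R_0}) = \max \sum_{R_i \in \mathcal{R}} \frac{\iota_{R_i}}{|\mathcal{R}_i|} \sum_{R_j \in \mathcal{R}_i} \mathcal{E}_{R_j} \tag{VI.1}$$

$$\text{subject to } \sum_{R_i \in \mathcal{R}} \iota_{R_i} \le |S|_{max} \ \text{ and } \bigcup_{R_i \in \mathcal{R}} \mathcal{R}_i = \mathcal{R}_0$$

where $\iota_{R_i}$ equals 1 if paths are aggregated at $R_i$, and 0
otherwise. For a non-aggregated path, $\iota_{R_i}$ is 1 at the leaf node.
Since $\sum_{R_i \in \mathcal{R}} \iota_{R_i}$ is the number of path identifiers seen at $R_0$,
it should be bounded by $|S|_{max}$.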

In the above equation, aggregation at $R_i$ decreases the
total conformance estimate by $\left(1 - \frac{1}{|\mathcal{R}_i|}\right) \sum_{R_j \in \mathcal{R}_i} \mathcal{E}_{R_j}$. We define
this value as the aggregation cost and denote it by $C_A(R_i)$;
i.e., $C_A(R_i) = \left(1 - \frac{1}{|\mathcal{R}_i|}\right) \sum_{R_j \in \mathcal{R}_i} \mathcal{E}_{R_j}$. Hence, a set of nodes
at which aggregating path-identifiers produces the minimum
(total) aggregation cost would be a solution to the above
problem.

We note that, if the set of aggregating nodes (routers) is
fixed, the optimization problem of Eq. (VI.1) is the same as
the 0-1 knapsack problem$^7$, which is known to be NP-complete.
In Eq. (VI.1), however, the set of aggregating nodes and the
relative aggregation cost of a leaf node ($|\mathcal{R}_i|^{-1}\mathcal{E}_{R_j}$, $R_j \in \mathcal{R}_i$)
vary as aggregation proceeds to the root. This means the 0-1
knapsack problem should be solved repeatedly as the set
of aggregating nodes is redefined. We present an efficient
algorithm for this problem below.

C.  Aggregation  Algorithm 

Whenever aggregation is necessary (i.e., $|S| > |S|_{max}$),
aggregation is performed as summarized in Algorithm 1. Let
$O$ be the solution set and $C$ be the candidate set. Initially, $O$ is
empty and $C$ has all intermediate (i.e., non-leaf) nodes in $T'_{R_0}$
as its elements. Then, the algorithm works in a greedy fashion:
in each iteration, the node that causes the lowest cost-decrease
to $C$ is added to $O$, and this continues until the constraint on
the number of path identifiers in Eq. (VI.1) is satisfied. Though
Algorithm 1 is a greedy approximation algorithm, it ensures
that the total cost of the candidate set decreases minimally
at each iteration. As a consequence, its approximation error
from the optimal aggregation cost is bounded by the number
of incoming links of the last node added to $O$. We provide
the proof of the error bound in Appendix B.


Algorithm 1 Aggregation

1: Set $O = \emptyset$ and $C = \{R_i \mid R_i \in T'_{R_0} - \mathcal{R}_0\}$.

2: Move the lowest aggregation cost node in $C$ to $O$.

3: $R_i \in C$ replaces the current solution set if it satisfies the
following replacement conditions:

   • $C_A(R_i) > \max_{R_j \in O} C_A(R_j)$

4: Repeat steps 2 and 3 until the constraint on the number of
path-identifiers (in Eq. (VI.1)) is satisfied.
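
A simplified greedy variant of this procedure can be sketched as follows. It is not a faithful transcription of Algorithm 1: the replacement step (step 3) is omitted, an aggregated path is assumed to be valued at the mean conformance estimate of the leaves it covers, and the tree encoding, function names, and sample values are all hypothetical. In each round it aggregates at the non-root internal node whose aggregation removes the least total conformance.

```python
def leaves_under(children, node):
    """Leaf nodes of the subtree rooted at node."""
    kids = children.get(node, [])
    if not kids:
        return [node]
    leaves = []
    for kid in kids:
        leaves.extend(leaves_under(children, kid))
    return leaves


def subtree_nodes(children, node):
    """All nodes of the subtree rooted at node, including node itself."""
    nodes = {node}
    for kid in children.get(node, []):
        nodes |= subtree_nodes(children, kid)
    return nodes


def greedy_aggregate(children, conformance, root, s_max):
    """Aggregate until at most s_max distinct paths remain; return those paths."""
    frontier = set(leaves_under(children, root))   # current distinct paths
    value = dict(conformance)                      # value of each current path
    candidates = {n for n, kids in children.items() if kids and n != root}
    while len(frontier) > s_max and candidates:
        best = None
        for node in sorted(candidates):
            covered = frontier & subtree_nodes(children, node)
            if len(covered) < 2:
                continue                           # no path reduction here
            leaves = leaves_under(children, node)
            merged = sum(conformance[leaf] for leaf in leaves) / len(leaves)
            cost = sum(value[p] for p in covered) - merged
            if best is None or cost < best[0]:
                best = (cost, node, covered, merged)
        if best is None:
            break
        _, node, covered, merged = best
        frontier = (frontier - covered) | {node}
        value[node] = merged
        candidates.discard(node)
    return frontier


# Usage: two low-conformance leaves under A, two high-conformance leaves under B.
tree = {"R0": ["A", "B"], "A": ["a1", "a2"], "B": ["b1", "b2"]}
estimates = {"a1": 0.1, "a2": 0.2, "b1": 0.9, "b2": 0.8}
paths = greedy_aggregate(tree, estimates, root="R0", s_max=3)
```

In this toy tree the low-conformance pair is merged first, so the high-conformance paths keep their individual guarantees. The sketch does not separate legitimate and attack subtrees beforehand, which the scheme described above does before aggregating.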


VII.  Simulation  Results 

In this section, we present our ns2 simulation results for
various attack scenarios to evaluate our design. Network
topologies for simulations are configured to capture the
worst-case effect of different attacks and to ascertain how well our
design goals are satisfied. The balanced tree shown in Fig. 3
is used for simulations that evaluate the access guarantees and
the effectiveness of aggregation. The unbalanced tree is used to
show that our scheme effectively provides access guarantees to
domains independently of their location on a routing path. We
assign 5% of link capacity to the capability request channel as
in [6]. In most simulations, the total request rate of legitimate
sources is set close to the link capacity of the request channel
(i.e., $\rho_{S_i} \approx 1$ for legitimate domains) to accurately capture
the effects of attacks. Requests are randomly placed during the
specified simulation interval to approximate Poisson arrivals.

$^7$ $\mathcal{E}_{R_j}/|\mathcal{R}_i|$ can be considered as the unit value of an element,
$|\mathcal{R}_i|$ as the size of an element, and $|S| - |S|_{max}$ as the knapsack
size in the 0-1 knapsack problem.

Fig. 3: Topology used in simulation.
Legend: “d” is the number of sibling nodes and “h” is the tree
height.

We compare our simulation results with those of TVA [6],
which protects capability requests using a hierarchical
fair-queueing mechanism.

A.  Link-Access  Guarantees 

To evaluate the local effect of flooding attacks in our
scheme, we use a 27-path balanced tree, where 30 legitimate
sources are attached to each leaf node, and the number of attack
sources is increased at one leaf node. In this simulation, we set
the number of access-guaranteed paths ($|S|_{max}$) to 27 and the
buffer size to 108 packets so that 4 buffer-slots are
guaranteed to each path. Each source randomly starts 100
different sessions (which is equivalent to 100 times more
sources) between 0 and 10 seconds. This source configuration
is used for all simulations. We also run simulations with
a TVA [6] router configured to have 1000 queues of length 4
(as TVA requires distinct queues for individual sources in the
current implementation) for comparative evaluation.

As Fig. 4 shows, the request drop ratios of legitimate paths
are stable over the wide range of attack sizes with both our
scheme and TVA. That is, both schemes effectively localize
flooding attacks when compared with the no-defense case.
Note that a per-client defense would have the same result as
that of no defense when bots are used to flood the link. Yet,
our scheme outperforms TVA with a much smaller buffer (108
vs. 4000 buffer-slots). This is because our scheme dynamically
adjusts virtual-queue lengths in a min-max manner, which in
effect allows more than the guaranteed buffer-slots to
path-identifiers unless their bursts are synchronized (in which case,
only the guaranteed buffer-slots hold).

To illustrate the robustness of the guarantees that our scheme
provides, we configure an extreme adversarial scenario where
60 paths of a 64-path balanced tree (i.e., h = 3 and d = 4
in Fig. 3) send a large number of requests, and observe the
service ratio of the remaining 4 paths. Fig. 5 shows the
probabilistic guarantee ($G(|S|, k, S_i)$, viz., Eq. (V.1)), the stationary
service probability ($P(|S|, k, S_i)$)$^8$, and the simulation result
($P_r(|S|, k, S_{\mathcal{L}})$) for the set of legitimate path-identifiers $S_{\mathcal{L}}$,
under specified bandwidth utilizations (the ratio of request
rate to allocated bandwidth). Even under this extreme attack
scenario, the service ratio of legitimate paths is close to the
theoretical stationary packet service probability, which is much
higher than the probabilistic guarantees, as illustrated in the
figure.

Next, we show that link-access guarantees provided by our
scheme are independent of attack location. For this simulation,
we use the 40-path unbalanced tree shown in Fig. 3. We
attach 30 legitimate sources to each leaf node, and 200 attack
sources to each of eight attack nodes; four of these nodes
are placed at different locations for each simulation and the
remaining four nodes are placed at the farthest location from
the flooded link. In this scenario, we simulate the queue
implementation for $G(34, 8, S_i)$, $G(64, 4, S_i)$ and $G(64, 8, S_i)$,
and those for the corresponding 4- and 8-slot queues in a
TVA router (i.e., 4000 and 8000 total buffer-slots, respectively).
Fig. 6 shows the request drop ratios of legitimate paths, where
the horizontal axis represents the index of attack location (viz.,
unbalanced tree in Fig. 3). With our scheme, the request drop
ratios are uniform over different attack locations. This means
our scheme provides almost the same protection against flooding
attacks regardless of the attackers’ location. In contrast, TVA’s
performance is highly dependent upon the attackers’ location since
TVA assigns more buffer space to nearby domains (viz.,
Section II).

B.  Differential  Guarantees 

Path-identifier aggregation, which optimizes domain
bandwidth allocation when attack sources are widely dispersed
across domains, occurs whenever the number of active paths
($|S|$) becomes greater than the number of access-guaranteed
paths ($|S|_{max}$). In Fig. 6, the result of the queue
implementation for $G(34, 8, S_i)$ illustrates the effectiveness of aggregation.
As aggregation increases bandwidth allocation to legitimate
paths by a factor of $(|S| - |S|_{max})/|S|_{max}$ (i.e., 6/34 ≈ 17.6% in that
simulation), the request drop ratio of those paths decreases by
76.8% (from 6.43% to 1.49%) when compared with that of
the queue implementation for $G(64, 4, S_i)$ (under which no
path aggregation occurs). This is far below the stationary drop
probability of legitimate paths (i.e., $1 - P(|S|, 8, S_i) \approx 5.32\%$),
which would result when physically separate queues are
assigned to those paths.
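
The stationary drop probability referred to above comes from the blocking probability of an M/M/1/k queue (footnote 8). The sketch below evaluates that standard formula; the function name is hypothetical, and the utilization of 0.85 is an assumed value chosen because it yields a drop probability of about 5.3% for 8 buffer-slots. The utilization actually used in the simulation is not stated in this passage.

```python
def stationary_service_probability(k: int, rho: float) -> float:
    """1 minus the M/M/1/k blocking probability rho^k (1 - rho) / (1 - rho^(k+1))."""
    if rho == 1.0:
        # Limit of the blocking probability as rho -> 1 is 1 / (k + 1).
        return 1.0 - 1.0 / (k + 1)
    blocking = rho ** k * (1.0 - rho) / (1.0 - rho ** (k + 1))
    return 1.0 - blocking


drop_probability = 1.0 - stationary_service_probability(k=8, rho=0.85)
```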

We also evaluate the effectiveness of the protocol
conformance measure in aggregating attack paths. For this, we
configure a 64-path balanced tree such that the same number
of nodes are attached to leaf nodes to make the request rates of
all paths identical. Then, we set $|S|_{max}$ to 34 (which limits the
number of attack path-identifiers to at most two) and increase
the fraction of attack sources whose capability requests are
denied at the destination host, from 10 to 100% in half of
the leaf nodes. Note that the bandwidth conformance measure
alone cannot distinguish attack paths from legitimate ones
when the same request rates occur in all paths.

$^8$ For $k$ guaranteed buffer-slots, the stationary packet service probability of
$S_i$ is determined by $P(|S|, k, S_i) = 1 - \frac{\rho_{S_i}^{k}(1 - \rho_{S_i})}{1 - \rho_{S_i}^{k+1}}$. This is derived from
the blocking probability of an M/M/1/k queueing system.

Fig. 4: Request drop ratio of legitimate
paths (x-axis: number of attackers). Error bars represent 95%
confidence intervals.

Fig. 5: Request service probability of
legitimate paths with respect to bandwidth
utilization ($\rho$ = request rate / assigned capacity). The solid
horizontal lines inside bars represent the
probabilistic guarantees ($G(|S|, k, S_i)$).

Fig. 6: Request drop ratio of legitimate
paths with respect to attack location in
the unbalanced tree. TVA(k) represents
the result of TVA with queue-length k.

Fig. 7: Aggregation by protocol conformance (x-axis: % attack
sources in contaminated domains): The request
service ratio of legitimate paths increases as the fraction of
bots becomes higher.

As Fig. 7 shows, aggregation is more precisely performed on
attack paths (which leads to higher service ratios of legitimate
paths) as the fraction of attack sources in contaminated
domains grows. When domains are lightly contaminated (i.e., the
fraction of attack sources is less than 40% in this simulation),
legitimate paths can be aggregated. This is because aggregating
attack paths near the attack target (i.e., multi-level aggregation
of those attack paths) produces a higher aggregation cost than
aggregating legitimate paths near their origins. The relatively high
cost of multi-level aggregation also causes high service-ratio
variation for legitimate paths, as a result of imprecise distinction
between legitimate and attack paths.

C.  Rolling  Attacks 

Another simulation we performed is that of “rolling
attacks”, whereby attack sources change their location to
exploit delays in the response time of any defense mechanism.
For this simulation, we attach 16 attack nodes at 4 different
locations in the unbalanced tree (i.e., at nodes 1, 2, 9 and 10) of
Fig. 3 and place 200 attack sources in each attack node. We
configure a rolling attack such that attack sources attached to
nodes 1 and 10 flood the target for 10 seconds and the other
attack sources for the next 10 seconds, with a 20-second period.

Fig. 8: Time variation of goodput ratio at the congested link
(x-axis: time in seconds).
Legend: Error bars represent the minimum and maximum of
goodput ratio.

In Fig. 8, we illustrate the time variation of goodput ratio
(viz., Section VI) at the congested link averaged over 10
runs. The goodput ratio is very low at the beginning of
the simulation, since attack requests go through the target
link before being preempted by legitimate ones. However,
as buffer-preemption occurs (as soon as the buffer is filled)
and aggregation starts (around t = 2), the goodput ratio rises
sharply. Changing attack location significantly decreases the
goodput ratio as the number of attack path-identifiers at the
congested router increases four times (i.e., from 2 aggregated
path-identifiers to 8 path-identifiers). However, these effects
disappear whenever a new aggregation decision is made on
the switched attack paths in $\Delta_{agg}$ (which is set to 20 RTTs ≈
2 seconds in this simulation).

VIII.  Internet-scale  Simulations 

In this section, we present large-scale simulation results to
evaluate and compare the effectiveness of different defense
mechanisms (i.e., DefAT, Portcullis and TVA) against DoC
attacks. For this purpose, we construct network topologies using
real packet-routes and bot distribution in the Internet. Then, we
compare the link-access times of legitimate capability-requests
under different defense mechanisms.

Fig. 9: Bot distribution vs. the number of ASs.

A.  Datasets 

For Internet-scale simulations, we use two real datasets:
CAIDA Skitter-Map [24] and Composite Blocking List
(CBL) [11]. A Skitter-Map contains the full routing-paths
measured from a root-DNS server to a large set of
randomly-chosen hosts (300 to 400 thousand) in the Internet. A
Skitter-Map is used as a reference topology from which simulation
topologies are generated for given attack sizes. Among several
distinct maps (that are constructed at different locations), we
use widely different topologies for simulations, in order to
observe the dependence of defense mechanisms on network
topologies (more specifically, the locales of legitimate and
attack sources).

A CBL contains a list of the IP addresses of active
spam-bots. We first cluster the IP addresses in the CBL by their AS
using GeoLite ASN [25], and obtain a reference distribution
of bots (clustered by AS) as illustrated in Fig. 9. The figure
shows that 300 ASs are responsible for about 90% of bots and
600 ASs for over 95%. Compared with the number of active ASs
(which is over 35,000), only a small fraction of
ASs host most bots. This is evidence of the highly non-uniform
distribution of bots in the Internet.

Based on this bot-distribution, once the size of a simulated
attack is determined, we harvest attack sources from the
same subnets as those that appear in the reference topology (i.e.,
Skitter-Map) and place these attack sources in the topology
such that they have the same bot-distribution as the reference
distribution. Then, we randomly choose legitimate sources
from the Skitter-Map and add them to the simulation topology.
Thus, the distribution of legitimate sources would be similar
to that of AS sizes (in terms of the allocated IP address
space).

B.  Scenarios 

Our simulator runs in a discrete-time fashion, where
individual packets advance a single router-hop in a time tick. If we
assume a 5 ms delay for a single link (i.e., a clock-tick is 5 ms),
the end-to-end delay for a source located 30 hops from the
destination would be 150 ms. Routers keep the packets that arrive
during a tick and handle the packets according to their admission
policy. In simulations, the bottleneck-link capacity is set to
1000 requests per tick, which corresponds to 2.8 Gbps (i.e.,
slightly higher capacity than OC-48) if a 5 ms clock-tick and a
5% bandwidth reservation for the capability-request channel
are assumed. In all simulations, we configure attack sources to
send 10 times more capability-request packets than legitimate
sources; hence the relative strength of attack at the target-link
would be 10 times the ratio of attack sources to legitimate
ones. Attack sources start sending packets from the beginning
of simulation and keep flooding the target during the entire
simulation interval. Meanwhile, legitimate sources start their
transmission after the target-link is fully congested, to avoid the
case in which packets from sources located close to the target get
through the link (i.e., are serviced) before congestion occurs
and those sources finish their transmission. Since packet arrivals from
legitimate sources are delayed in proportion to their distance
to the target-link, these packet arrivals would have the same
distribution as that of path length (viz., Section VIII-C). Before
presenting simulation results, we first briefly describe the
individual defense mechanisms (i.e., Portcullis, TVA, and DefAT)
used for comparison.

•  Portcullis: A Portcullis client, once it identifies that its
capability request has been rejected (due to link congestion),
starts solving a computational puzzle that requires it to
spend a certain amount of time (i.e., proves its
computational effort) and increases the puzzle level until it
receives a valid capability. In order to solve a higher-level
puzzle, the client needs to spend twice the time spent for
the current-level puzzle. Portcullis routers prioritize the
packets that carry higher-level puzzle solutions. Hence,
once a legitimate source solves a higher-level puzzle than
attack sources, its request is guaranteed to be serviced
at Portcullis routers. Portcullis provides the best
per-source link-access guarantee if attack sources cannot be
distinguished from legitimate sources and are uniformly
distributed over the Internet.

•  TVA: TVA implements fair queueing on the incoming
domains (i.e., ASs) from which packets arrive. The
original scheme was later improved to handle remotely
originated packets, which, whether legitimate or
not, become aggregated with others as they proceed to
the target. For this purpose, TVA adopts a hierarchical
fair queueing mechanism, where the TVA router allocates
a fair amount of queue space to immediate upstream
domains and these queues are split recursively for their
upstream domains to provide fairness. Hence, the queue
size for a source domain is determined by its distance (in
terms of AS hops) to the target domain and the number
of paths that are aggregated on its path to the destination.
In the simulator, fair queueing is implemented for all
outstanding requests that arrive during a time tick by keeping
all requests during the interval and randomly choosing the
excess requests that need to be dropped. We assume that
all capability-requests carry valid path-identifiers though
path-identifier authenticity is not considered in TVA.
Hence, our implementation approximates the best fairness
that TVA can achieve. In comparative simulations, we use
this advanced version of TVA.

•  DefAT: A DefAT router provides link-access guarantees to source ASs
as explained throughout this paper. However, for fair comparative
simulations, path-aggregation is disabled at DefAT routers because it
can significantly favor the results of DefAT depending on how we set
the number of access-guaranteed paths. Though the number of
access-guaranteed paths can be optimized to maximize goodput, we leave
it as a configurable parameter as discussed before. We assume that
path-identifiers cannot be spoofed or replayed, owing to the security
protection mechanisms provided in Section IV.
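The per-tick admission rule described for the TVA simulator (keep all requests that arrive during a tick, then randomly drop the excess) can be sketched as follows. This is only an illustrative sketch, not the simulator's code: the function name `admit_requests_for_tick`, the flat per-queue capacity, and the `(queue_id, request_id)` request representation are assumptions made here for illustration.

```python
import random
from collections import defaultdict


def admit_requests_for_tick(requests, per_queue_capacity, rng=None):
    """Admit at most `per_queue_capacity` requests per queue for one tick.

    `requests` is a list of (queue_id, request_id) pairs that arrived
    during the tick (hypothetical representation). When a queue holds
    more requests than its capacity, the survivors are chosen uniformly
    at random, i.e., the excess requests are dropped at random.
    """
    rng = rng or random.Random()
    by_queue = defaultdict(list)
    for queue_id, request_id in requests:
        by_queue[queue_id].append(request_id)

    admitted = []
    for queue_id, pending in by_queue.items():
        if len(pending) > per_queue_capacity:
            # Randomly keep exactly `per_queue_capacity` requests.
            pending = rng.sample(pending, per_queue_capacity)
        admitted.extend((queue_id, request_id) for request_id in pending)
    return admitted


if __name__ == "__main__":
    arrivals = [("AS-A", i) for i in range(5)] + [("AS-B", 0)]
    print(admit_requests_for_tick(arrivals, 2, random.Random(0)))
```

In this sketch a lightly loaded queue (here `AS-B`) never loses requests to a heavily loaded one, which is the fairness property the simulator approximates.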


C.  Topology 

We choose three Skitter maps constructed at different locations (i.e.,
f-root, h-root, and apan-jp) and generate simulation topologies with
two parameters: the number of legitimate sources and the number of
attack sources. In topology generation, we set the number of
legitimate sources to 10K and vary the attack size from 50K to 300K.
Fig. 10 shows the topology statistics (i.e., the AS-path length from
source to destination and the average degree of ASs located at the
same distance from the target) for 300K attack sources. The AS-path
length, which is the number of ASs on a path including the source AS,
ranges from 1 to 10, yet is mostly concentrated between 3 and 5 in all
three topologies. The average AS-degree (i.e., the number of immediate
upstream ASs that send/forward traffic) differs widely across the
topologies; hence it better characterizes them. Note that the number
of paths (left vertical axis) and the AS-degree (right vertical axis)
are shown on a linear scale and a log scale, respectively. For
example, f-root is constructed at an AS that has a very low degree yet
whose provider ASs have a very high degree. On the other hand, h-root
is constructed at a high-degree AS whose 1-hop and 2-hop neighboring
ASs have high AS-degrees as well. Finally, the apan-jp topology has
mid-range degrees both at the target (where the topology is
constructed) and at its provider. Simulations using these topologies
would produce different results when a router's defense scheme
prioritizes traffic based on its confidence in the traffic source
(e.g., more buffer allocation to closer domains in TVA). We note that
the topologies do not change significantly for different attack
strengths, mainly because attack sources are highly clustered by their
locale. We summarize additional statistics on the above topologies in
Table I.

Legend:  l_avg: average path length, l_max: longest path length,
l^AS_avg: average AS-path length, l^AS_max: longest AS-path length,
N_r: number of routers.

TABLE I: Topology statistics.

Topology   l_avg   l_max   l^AS_avg   l^AS_max   N_r
f-root     14.99   29      5.44       10         48,624
h-root     13.55   31      4.84       10         42,679
apan-jp    17.15   33      4.80       9          36,621

D.  Comparative  Simulations 

We compare the link-access times of legitimate capability-requests
with different defense mechanisms employed at the target link. In the
f-root topology, DefAT provides earlier link access than the other
mechanisms to over 90% of legitimate requests in the presence of 100K
attack sources, and 80% of those requests are almost unaffected by the
attack when compared with the reference access-time curve, i.e., that
of no attack (see Fig. 11(a)). With Portcullis, all legitimate
requests obtain link access once legitimate sources start solving a
higher-level puzzle than the attack sources. The figure shows that
about half of the requests are serviced at around 150 ticks, yet the
remaining requests are serviced at 300 ticks since they had to spend
twice the time (i.e., 300 ticks) to solve the next-level puzzle. As a
consequence, the link-access time curve of Portcullis has two sharp
increases like a step function; i.e., the link-access times show a
bimodal distribution. TVA favors only a small fraction of legitimate
requests (about 45% of legitimate requests in the f-root topology)
because requests from ASs close to the target have a higher buffer
allocation than those from remote ASs.
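The two steps in the Portcullis curve follow from the doubling of puzzle-solving time per level. A minimal sketch of that arithmetic is shown below; the function name `puzzle_solve_time` and the base time of 150 ticks (chosen to match the first step reported above) are assumptions for illustration, not parameters taken from the simulator.

```python
def puzzle_solve_time(base_ticks, level):
    """Ticks needed to solve a puzzle at `level`.

    Assumes level 0 takes `base_ticks` and every additional level
    doubles the solving time, as described for Portcullis clients.
    """
    if level < 0:
        raise ValueError("level must be non-negative")
    return base_ticks * 2 ** level


if __name__ == "__main__":
    BASE_TICKS = 150  # assumed base time, matching the first step above
    for level in range(3):
        print(level, puzzle_solve_time(BASE_TICKS, level))
```

Under these assumptions, a client that must move one level up waits 300 ticks instead of 150, which is consistent with the two service times observed for legitimate requests.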

In the h-root topology, slightly faster link-access times are observed
with DefAT. This is because the path diversity in this topology is
higher than in f-root, as can be seen in Fig. 10. Higher path
diversity of legitimate requests enables those requests to get more
buffer allocation in a DefAT router, as DefAT provides guarantees to
individual source ASs (this further reduces the buffer allocation to
highly clustered attack requests). Portcullis provides almost
identical link-access times for legitimate requests despite topology
changes, since its request admission is primarily determined by the
clients' computational effort (the level of puzzle that clients
solved) rather than by the simulation topology. With TVA, a
significantly different result is observed: only 22% of legitimate
requests have earlier link access than with no defense, and service to
the remaining 78% is delayed more than in the f-root topology. This
shows that the effectiveness of TVA is highly dependent on the network
topology. These results are consistent under different attack sizes
(i.e., topologies with 100K, 200K, and 300K attack sources), as
Fig. 11, 12 and 13 show.

(a) f-root  (b) h-root

Fig. 10: Simulation topology with 300K attack sources.

(a) f-root  (b) h-root  (c) apan-jp

Fig. 11: Link-access time for legitimate capability-requests under 100K attack sources.

Fig. 12: Link-access time for legitimate capability-requests under 200K attack sources. (x-axis: request wait time in clock ticks.)

Now, we observe how the link-access times of legitimate requests are
affected by the attack size by increasing the attack size from 50K to
300K. With DefAT, about 80% of legitimate requests are unaffected
regardless of the attack size, yet the remaining 20% of requests
(which originate from attack domains) take longer to get through the
congested link as the attack size grows. This is an expected result
since we do not attempt to distinguish legitimate requests from
attack requests if they originate from the same domain. In all three
topologies, the link-access times show a consistent result: the
link-access times of 20% of legitimate requests become longer as the
attack size grows, though they differ slightly across topologies. With
Portcullis, the link-access times double once the attack size reaches
a certain threshold. However, the threshold is not proportional to the
number of attack sources because attack sources, in order to congest a
link with a higher level of puzzles (attack sources solve
computational puzzles as well), must double in number. Otherwise, they
cannot fully congest the link. Thus, with Portcullis, link-access
times are highly dependent on the attack size even though they do not
depend on network topologies. In contrast, TVA's performance is highly
dependent on network topologies, as illustrated in Fig. 14(c), 15(c)
and 16(c). This phenomenon can be explained as follows: despite the
high AS-degree of the congested domain (see Fig. 10(b)), most
legitimate requests are aggregated with attack requests if they
originate remotely (note that in a limited-size buffer, queues cannot
split indefinitely). TVA works well only if the high-degree AS is
directly connected with, or located close to, the target AS. This is
why TVA shows relatively better performance in the apan-jp topology.
However, TVA's advantage to legitimate sources is marginal, as TVA
allocates more buffer space to closely located ASs regardless of
whether they originate legitimate or attack requests. In Fig. 15(c)
and 16(c), several leaps in the CDF (e.g., between 1000 and 1500 ticks
in Fig. 15(c)) indicate that more queue space becomes available to
legitimate requests in a short time interval. This happens when some
legitimate paths disappear after finishing transmission and eventually
enable a TVA queue to be split into multiple separate queues for the
remaining paths. Note that a TVA queue splits recursively (towards the
source ASs in the traffic tree) unless the number of distinct incoming
paths (i.e., immediate children) to the queue (i.e., an intermediate
node in the traffic tree) exceeds the available queue size.
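The recursive splitting rule in the last sentence can be illustrated with a small sketch. It is a simplification under stated assumptions, not TVA's implementation: the traffic tree is a plain dictionary mapping each node to its immediate upstream children, queue space is counted in integer slots, shares are split evenly with integer division (leftover slots are simply ignored), and the name `split_queue` is hypothetical.

```python
def split_queue(tree, node, queue_size, allocation=None):
    """Recursively split `queue_size` slots among the upstream children of `node`.

    `tree` maps a node to the list of its immediate children in the
    traffic tree (hypothetical representation). Splitting stops at a
    node when it has no children or when its number of children exceeds
    the available queue size; traffic arriving through that node then
    shares one aggregated queue. Returns {node: slots} for the nodes
    where splitting stopped.
    """
    if allocation is None:
        allocation = {}
    children = tree.get(node, [])
    if not children or len(children) > queue_size:
        # Leaf, or too many incoming paths: keep a single aggregated queue.
        allocation[node] = queue_size
        return allocation
    share = queue_size // len(children)  # even split; remainder is unused
    for child in children:
        split_queue(tree, child, share, allocation)
    return allocation


if __name__ == "__main__":
    # Target T has two upstream ASs; A in turn has three upstream ASs.
    traffic_tree = {"T": ["A", "B"], "A": ["A1", "A2", "A3"]}
    print(split_queue(traffic_tree, "T", 4))
    print(split_queue(traffic_tree, "T", 8))
```

With 4 slots the three paths behind `A` stay aggregated in one queue, whereas with 8 slots they are separated but each receives less space than the nearer domain `B`, which mirrors the dependence on distance and aggregation discussed above.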

IX.  Conclusions 

In this paper, we present a defense scheme against link-flooding
attacks that target connection setup in capability systems. Our design
of a new authenticated path-identification mechanism provides
individual packets with unforgeable domain identifiers, to which
link-access guarantees are provided at remote routers. Guarantees of
link access, defined as probabilistic lower bounds of link access, are
provided on a per-domain basis and differentially based on domain
contamination. We show the effectiveness of our design in two ways.
First, ns2 simulations support our analytical results: (1) link-access
guarantees that are independent of global attack sources and their
location, and (2) resilience against attack dispersion via
differential guarantees. Second, Internet-scale simulations, using
real network topologies and bot distributions, provide strong evidence
of the non-uniform distribution of bots and of how DefAT localizes
their effects on legitimate capability-requests. More specifically,
the simulation results show that over 80% of legitimate requests are
unaffected or minimally affected by large-scale attacks, which could
not be achieved with previous per-source or per-aggregate defense
mechanisms. We note that differential link-access guarantees would
provide positive incentives to administrative domains that employ
strong security measures against malware contamination.

A.  Proof of Probabilistic Guarantees

If $\rho_{S_i} < 1$, a packet carrying $S_i$ is guaranteed to be
serviced if fewer than $k$ arrivals of $S_i$ have occurred in
$\Delta_q$ before its arrival. Let $N_{S_i}(\Delta_q)$ be the number
of requests in $\Delta_q$. Then, the probability of service guarantee
on $S_i$ is given as follows.


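$$
\Pr\big(N_{S_i}(\Delta_q) < k\big)
  = \sum_{j=0}^{k-1} \frac{(\lambda_{S_i}\Delta_q)^j \, e^{-\lambda_{S_i}\Delta_q}}{j!}
  \ge \sum_{j=0}^{k-1} \frac{(k \cdot \rho_{S_i})^j \, e^{-k \cdot \rho_{S_i}}}{j!}
  \qquad \text{(A-1)}
$$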
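To get a feel for the bound, the final Poisson sum in (A-1) can be evaluated numerically. The sketch below assumes that the number of arrivals of $S_i$ in $\Delta_q$ is Poisson distributed with mean $k \cdot \rho_{S_i}$, which is how the sum appears to be parameterized; the function name `service_guarantee_probability` is hypothetical.

```python
import math


def service_guarantee_probability(k, rho):
    """Evaluate sum_{j=0}^{k-1} (k*rho)^j * exp(-k*rho) / j!.

    This is the probability that a Poisson variable with mean k*rho
    (assumed arrival model) is smaller than k, i.e., that fewer than k
    requests of the same identifier arrive before the packet of interest.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    if rho < 0:
        raise ValueError("rho must be non-negative")
    mean = k * rho
    return sum(
        mean ** j * math.exp(-mean) / math.factorial(j) for j in range(k)
    )


if __name__ == "__main__":
    for rho in (0.3, 0.6, 0.9):
        print(rho, service_guarantee_probability(10, rho))
```

Under this assumed model, the probability decreases as $\rho_{S_i}$ approaches 1 for a fixed $k$.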


(a) f-root  (b) h-root

Fig. 13: Link-access time for legitimate capability-requests under 300K attack sources. (Axes: request wait time in clock ticks vs. % of legitimate requests serviced, CDF.)

(a) DefAT  (b) Portcullis  (c) TVA

Fig. 14: Link-access time for legitimate capability-requests under the f-root topology and different botnet sizes.

(a) DefAT  (b) Portcullis  (c) TVA

Fig. 15: Link-access time for legitimate capability-requests under the h-root topology and different botnet sizes.

(a) DefAT  (b) Portcullis  (c) TVA

Fig. 16: Link-access time for legitimate capability-requests under the apan-jp topology and different botnet sizes.

In contrast, for $\rho_{S_i} > 1$, a per-packet guarantee cannot be
provided, since at least $(\rho_{S_i} - 1)/\rho_{S_i}$ of requests
must be dropped regardless of the buffer size. In this case, only a
fraction of its requests can be guaranteed to be serviced (i.e.,
$1/\rho_{S_i}$), hence the probabilistic lower bound of link access is
defined as the product of $1/\rho_{S_i}$ and the probability that the
allocated bandwidth is fully utilized. Let $P_f(S_i)$ be the
probability that a packet arrival of $S_i$ finds $k$ buffered $S_i$'s.
The probability of full bandwidth utilization is greater than
$P_f(S_i)$. Let

Gc  =  EkjZo  Then, 

Pf(Si)  =  1  —  Pr(#  of  Si’s  in  the  buffer  <  k) 


the maximum increase of aggregation cost at an intermediate
aggregator is bounded by n.

case (b):

By (B-1), aggregation can occur at a node if either all its children
nodes are in the solution set, or
$C_A(R_i) < \sum_{R_{oi} \in O} C_A(R_{oi})$, where
$O = \{R_{o1}, R_{o2}, \ldots, R_{on}\}$ is the optimal solution set
and $R_{on}$ is the last node added to the current solution set.


>1  -&  +  £ 


(a-  •  i>s,  y  -k-, 


>1  + 

\  j—k 


j  -k  +  1 


j  ~  k  +  1 


(i-gcy-k+1g'r1 
$$C_A(R_i) < \sum_{R_{oi} \in O} C_A(R_{oi})$$

$$C_A(R_i) < \sum_{j=1}^{n-1} C_A(R_{oj}) + C_A(R_{on})$$

$$C_A(R_i) - \sum_{j=1}^{n-1} C_A(R_{oj}) < C_A(R_{on})$$

Hence, the increase of aggregation cost cannot be greater than the
product of $\varepsilon_{th}$ and the incoming-link degree of the last
node added to the solution set.

By Eqs. (A-1) and (A-2), Eq. (V.1) follows.

B.  Proof of Error Bound

We first define two types of aggregating node. In $T_{f0}$, the
node whose children are all leaf nodes is defined as the "leaf
aggregator" and any other non-leaf node is defined as an "intermediate
aggregator." The last node added to the solution set can be either a
leaf aggregator or an intermediate aggregator.

If the last node $R_i$ added to the optimal set ($O$) is a leaf
aggregator, the error from the optimal solution is bounded by
$\sum_{R_j \in \mathcal{R}_i} \varepsilon_{R_j} = |\mathcal{R}_i| \cdot \varepsilon_{th}$,
where $|\mathcal{R}_i|$ is the number of incoming links of $R_i$.

If the last node added to $O$ is an intermediate aggregator, we can
consider two different cases. Let $R_i$ be an intermediate aggregator,
and $R_{i1}, \ldots, R_{in}$ be the one-hop children of $R_i$. By the
definition of aggregation cost, the following inequality can be shown.


CA(Ri)  =  E  >  E^(%)  (B-l) 

'  l'  RjGKi  j= 1 

The above inequality means that the last node added to $O$ is either
(a) a node whose immediate children aggregators are all already
aggregated, or (b) a node whose aggregation cost is less than the
total aggregation cost of the current solution set.

case (a):

Like the leaf aggregator, if aggregation is performed at an
intermediate aggregator $R_i$, the sum of the aggregation costs of
$R_i$'s children is deducted from the total cost. Therefore,



